GiorgosTriantafyllou324

Added a non-zero C matrix tile to the sgemm_tcu testbench. Until now, the operation was C = A * B; the sgemm_tcu kernel.cpp now computes D = A * B + C.

I also changed sim/common/tensor_cfg.h so that the number of bits matches the int8_t type (8 instead of 16).
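
For reference, the move from C = A * B to D = A * B + C can be sketched on the host side as below. This is only a minimal illustration, not the actual kernel.cpp code: the function name `matmul_add_ref`, the row-major layout, and the int32_t accumulator type are assumptions made for the example.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference routine (not part of the repository).
// Computes D = A * B + C for row-major matrices:
//   A is M x K (int8_t), B is K x N (int8_t), C and D are M x N (int32_t).
std::vector<int32_t> matmul_add_ref(const std::vector<int8_t>& A,
                                    const std::vector<int8_t>& B,
                                    const std::vector<int32_t>& C,
                                    std::size_t M, std::size_t N, std::size_t K) {
  std::vector<int32_t> D(M * N);
  for (std::size_t i = 0; i < M; ++i) {
    for (std::size_t j = 0; j < N; ++j) {
      // Start from the C tile instead of zero, so a non-zero C is exercised.
      int32_t acc = C[i * N + j];
      for (std::size_t k = 0; k < K; ++k) {
        acc += static_cast<int32_t>(A[i * K + k]) *
               static_cast<int32_t>(B[k * N + j]);
      }
      D[i * N + j] = acc;
    }
  }
  return D;
}

// An int8_t element is exactly 8 bits wide, which is the width the
// configuration is expected to report for this type.
static_assert(sizeof(int8_t) * CHAR_BIT == 8, "int8_t must be 8 bits wide");
```

With an all-zero C this reduces to the previous C = A * B behaviour, so a testbench can compare both cases against the same routine.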

Contributor

@tinebp tinebp left a comment

Can you please remove all changes to sgemm_tcu and limit the PR to only fixing the cfg bug?
We prefer to keep this sgemm_tcu test compatible with the basic CUDA tensor core tutorials.

Author

@GiorgosTriantafyllou324 GiorgosTriantafyllou324 left a comment

Thank you for the fixes! Is it ready for merge?

Contributor

@tinebp tinebp left a comment

You need to create a separate pull request with the isolated config change.
We will delete this one.
